{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Amazon SageMaker Object Detection using the Image and JSON format\n",
    "\n",
    "1. [Introduction](#Introduction)\n",
    "2. [Setup](#Setup)\n",
    "3. [Data Preparation](#Data-Preparation)\n",
    "  1. [Download data](#Download-Data)\n",
    "  2. [Prepare Dataset](#Prepare-dataset)\n",
    "  3. [Upload to S3](#Upload-to-S3)\n",
    "4. [Training](#Training)\n",
    "5. [Hosting](#Hosting)\n",
    "6. [Inference](#Inference)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "Object detection is the process of identifying and localizing objects in an image. A typical object detection solution takes in an image as input and provides a bounding box on the image where an object of interest is, along with identifying what object the box encapsulates. But before we have this solution, we need to acquire and process a traning dataset, create and setup a training job for the alorithm so that the aglorithm can learn about the dataset and then host the algorithm as an endpoint, to which we can supply the query image.\n",
    "\n",
    "This notebook is an end-to-end example introducing the Amazon SageMaker Object Detection algorithm. In this demo, we will demonstrate how to train and to host an object detection model on the [COCO dataset](http://cocodataset.org/) using the Single Shot multibox Detector ([SSD](https://arxiv.org/abs/1512.02325)) algorithm. In doing so, we will also demonstrate how to construct a training dataset using the JSON format as this is the format that the training job will consume. We also allow the RecordIO format, which is illustrated in the [RecordIO Notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb). We will also demonstrate how to host and validate this trained model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "To train the Object Detection algorithm on Amazon SageMaker, we need to setup and authenticate the use of AWS services. To begin with we need an AWS account role with SageMaker access. This role is used to give SageMaker access to your data in S3 will automatically be obtained from the role used to start the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "arn:aws:iam::686482943557:role/service-role/AmazonSageMaker-ExecutionRole-20180306T112115\n",
      "CPU times: user 936 ms, sys: 1.38 s, total: 2.32 s\n",
      "Wall time: 986 ms\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "import sagemaker\n",
    "from sagemaker import get_execution_role\n",
    "\n",
    "role = get_execution_role()\n",
    "print(role)\n",
    "sess = sagemaker.Session()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also need the S3 bucket that you want to use for training and to store the tranied model artifacts. In this notebook, we require a custom bucket that exists so as to keep the naming clean. You can end up using a default bucket that SageMaker comes with as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "bucket = '<your_s3_bucket_name_here>' # custom bucket name.\n",
    "# bucket = sess.default_bucket()\n",
    "prefix = 'DEMO-ObjectDetection'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "433757028032.dkr.ecr.us-west-2.amazonaws.com/object-detection:latest\n"
     ]
    }
   ],
   "source": [
    "from sagemaker.amazon.amazon_estimator import get_image_uri\n",
    "\n",
    "training_image = get_image_uri(sess.boto_region_name, 'object-detection', repo_version=\"latest\")\n",
    "print (training_image)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Preparation\n",
    "[MS COCO](http://cocodataset.org/#download) is a large-scale dataset for multiple computer vision tasks, including object detection, segmentation, and captioning. In this notebook, we will use the object detection dataset. Since the COCO is relative large dataset, we will only use the the validation set from 2017 and split them into training and validation sets. The data set from 2017 contains 5000 images with objects from 80 categories.\n",
    "\n",
    "### Datset License\n",
    "The annotations in this dataset belong to the COCO Consortium and are licensed under a Creative Commons Attribution 4.0 License. The COCO Consortium does not own the copyright of the images. Use of the images must abide by the Flickr Terms of Use. The users of the images accept full responsibility for the use of the dataset, including but not limited to the use of any copies of copyrighted images that they may create from the dataset. Before you use this data for any other purpose than this example, you should understand the data license, described at http://cocodataset.org/#termsofuse\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download data\n",
    "Let us download the 2017 validation datasets from COCO and then unpack them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import urllib.request\n",
    "\n",
    "def download(url):\n",
    "    filename = url.split(\"/\")[-1]\n",
    "    if not os.path.exists(filename):\n",
    "        urllib.request.urlretrieve(url, filename)\n",
    "\n",
    "\n",
    "# MSCOCO validation image files\n",
    "download('http://images.cocodataset.org/zips/val2017.zip')\n",
    "download('http://images.cocodataset.org/annotations/annotations_trainval2017.zip')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "unzip -qo val2017.zip\n",
    "unzip -qo annotations_trainval2017.zip\n",
    "rm val2017.zip annotations_trainval2017.zip"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before using this dataset, we need to perform some data cleaning. The algorithm expects the dataset in a particular JSON format. The COCO dataset, while containing annotations in JSON, does not follow our specifications. We will use this as an opportunity to introduce our JSON format by performing this convertion. To begin with we create appropriate directories for training images, validation images, as well as the annotation files for both."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "#Create folders to store the data and annotation files\n",
    "mkdir generated train train_annotation validation validation_annotation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prepare dataset \n",
    "\n",
    "Next, we should convert the annotation file from the COCO dataset into json annotation files. We will require one annotation for each image.\n",
    "\n",
    "The Amazon SageMaker Object Detection algorithm expects lables to be indexed from `0`. It also expects lables to be unique, successive and not skip any integers. For instance, if there are ten classes, the algorithm expects and the labels only be in the set `[0,1,2,3,4,5,6,7,8,9]`. \n",
    "\n",
    "In the COCO validation set unfortunately, the labels do not satistify this requirement. Some indices are skipped and the labels start from `1`. We therefore need a mapper that will convert this index system to our requirement. Let us create a generic mapper therefore that could also be used to other datasets that might have nonunique or even string labels. All we need in a dictionary that would create a key-value mapping where an original label is hashed to a label that we require. Consider the following method that returns such a dictionary for the COCO validation dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import json\n",
    "import logging\n",
    "\n",
    "def get_coco_mapper():\n",
    "    original_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20,\n",
    "                    21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,\n",
    "                    41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,\n",
    "                    61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76, 77, 78, 79, 80,\n",
    "                    81, 82, 84, 85, 86, 87, 88, 89, 90]\n",
    "    iter_counter = 0\n",
    "    COCO = {}\n",
    "    for orig in original_list:\n",
    "        COCO[orig] = iter_counter\n",
    "        iter_counter += 1\n",
    "    return COCO"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us use this dictionary, to create a look up method. Let us do so in a way that any dictionary could be used to create this method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def get_mapper_fn(map):  \n",
    "    def mapper(in_category):\n",
    "        return map[in_category]\n",
    "    return mapper\n",
    "\n",
    "fix_index_mapping = get_mapper_fn(get_coco_mapper())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The method `fix_index_mapping` is essentially a look-up method, which we can use to convert lables. Let us now iterate over every annotation in the COCO dataset and prepare our data. Note how the keywords are created and a structure is established. For more information on the JSON format details, refer the [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "file_name = './annotations/instances_val2017.json'\n",
    "with open(file_name) as f:\n",
    "    js = json.load(f)\n",
    "    images = js['images']\n",
    "    categories = js['categories']\n",
    "    annotations = js['annotations']\n",
    "    for i in images:\n",
    "        jsonFile = i['file_name']\n",
    "        jsonFile = jsonFile.split('.')[0]+'.json'\n",
    "        \n",
    "        line = {}\n",
    "        line['file'] = i['file_name']\n",
    "        line['image_size'] = [{\n",
    "            'width':int(i['width']),\n",
    "            'height':int(i['height']),\n",
    "            'depth':3\n",
    "        }]\n",
    "        line['annotations'] = []\n",
    "        line['categories'] = []\n",
    "        for j in annotations:\n",
    "            if j['image_id'] == i['id'] and len(j['bbox']) > 0:\n",
    "                line['annotations'].append({\n",
    "                    'class_id':int(fix_index_mapping(j['category_id'])),\n",
    "                    'top':int(j['bbox'][1]),\n",
    "                    'left':int(j['bbox'][0]),\n",
    "                    'width':int(j['bbox'][2]),\n",
    "                    'height':int(j['bbox'][3])\n",
    "                })\n",
    "                class_name = ''\n",
    "                for k in categories:\n",
    "                    if int(j['category_id']) == k['id']:\n",
    "                        class_name = str(k['name'])\n",
    "                assert class_name is not ''\n",
    "                line['categories'].append({\n",
    "                    'class_id':int(j['category_id']),\n",
    "                    'name':class_name\n",
    "                })\n",
    "        if line['annotations']:\n",
    "            with open(os.path.join('generated', jsonFile),'w') as p:\n",
    "                json.dump(line,p)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import json\n",
    "jsons = os.listdir('generated')\n",
    "\n",
    "print ('There are {} images have annotation files'.format(len(jsons)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After removing the images without annotations, we have 4952 annotated images. Let us split this dataset and create our training and validation datasets, with which our algorithm will train. To do so, we will simply split the dataset into training and validation data and move them to their respective folders."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import shutil\n",
    "\n",
    "train_jsons = jsons[:4452]\n",
    "val_jsons = jsons[4452:]\n",
    "\n",
    "#Moving training files to the training folders\n",
    "for i in train_jsons:\n",
    "    image_file = './val2017/'+i.split('.')[0]+'.jpg'\n",
    "    shutil.move(image_file, './train/')\n",
    "    shutil.move('./generated/'+i, './train_annotation/')\n",
    "\n",
    "#Moving validation files to the validation folders\n",
    "for i in val_jsons:\n",
    "    image_file = './val2017/'+i.split('.')[0]+'.jpg'\n",
    "    shutil.move(image_file, './validation/')\n",
    "    shutil.move('./generated/'+i, './validation_annotation/')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Upload to S3\n",
    "Next step in this process is to upload the data to the S3 bucket, from which the algorithm can read and use the data. We do this using multiple channels. Channels are simply directories in the bucket that differentiate between training and validation data. Let us simply call these directories `train` and `validation`. We will therefore require four channels: two for the data and two for annotations, the annotations ones named with the suffixes `_annotation`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "train_channel = prefix + '/train'\n",
    "validation_channel = prefix + '/validation'\n",
    "train_annotation_channel = prefix + '/train_annotation'\n",
    "validation_annotation_channel = prefix + '/validation_annotation'\n",
    "\n",
    "sess.upload_data(path='train', bucket=bucket, key_prefix=train_channel)\n",
    "sess.upload_data(path='validation', bucket=bucket, key_prefix=validation_channel)\n",
    "sess.upload_data(path='train_annotation', bucket=bucket, key_prefix=train_annotation_channel)\n",
    "sess.upload_data(path='validation_annotation', bucket=bucket, key_prefix=validation_annotation_channel)\n",
    "\n",
    "s3_train_data = 's3://{}/{}'.format(bucket, train_channel)\n",
    "s3_validation_data = 's3://{}/{}'.format(bucket, validation_channel)\n",
    "s3_train_annotation = 's3://{}/{}'.format(bucket, train_annotation_channel)\n",
    "s3_validation_annotation = 's3://{}/{}'.format(bucket, validation_annotation_channel)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we need to setup an output location at S3, where the model artifact will be dumped. These artifacts are also the output of the algorithm's traning job."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "s3_output_location = 's3://{}/{}/output'.format(bucket, prefix)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training\n",
    "Now that we are done with all the setup that is needed, we are ready to train our object detector. To begin, let us create a ``sageMaker.estimator.Estimator`` object. This estimator will launch the training job."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "od_model = sagemaker.estimator.Estimator(training_image,\n",
    "                                         role, \n",
    "                                         train_instance_count=1, \n",
    "                                         train_instance_type='ml.p3.2xlarge',\n",
    "                                         train_volume_size = 50,\n",
    "                                         train_max_run = 360000,\n",
    "                                         input_mode = 'File',\n",
    "                                         output_path=s3_output_location,\n",
    "                                         sagemaker_session=sess)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The object detection algorithm at its core is the [Single-Shot Multi-Box detection algorithm (SSD)](https://arxiv.org/abs/1512.02325). This algorithm uses a `base_network`, which is typically a [VGG](https://arxiv.org/abs/1409.1556) or a [ResNet](https://arxiv.org/abs/1512.03385). The Amazon SageMaker object detection algorithm supports VGG-16 and ResNet-50 now. It also has a lot of options for hyperparameters that help configure the training job. The next step in our training, is to setup these hyperparameters and data channels for training the model. Consider the following example definition of hyperparameters. See the SageMaker Object Detection [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection.html) for more details on the hyperparameters.\n",
    "\n",
    "One of the hyperparameters here for instance is the `epochs`. This defines how many passes of the dataset we iterate over and determines that training time of the algorithm. For the sake of demonstration let us run only `30` epochs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "od_model.set_hyperparameters(base_network='resnet-50',\n",
    "                             use_pretrained_model=1,\n",
    "                             num_classes=80,\n",
    "                             mini_batch_size=16,\n",
    "                             epochs=30,\n",
    "                             learning_rate=0.001,\n",
    "                             lr_scheduler_step='10',\n",
    "                             lr_scheduler_factor=0.1,\n",
    "                             optimizer='sgd',\n",
    "                             momentum=0.9,\n",
    "                             weight_decay=0.0005,\n",
    "                             overlap_threshold=0.5,\n",
    "                             nms_threshold=0.45,\n",
    "                             image_shape=512,\n",
    "                             label_width=600,\n",
    "                             num_training_samples=4452)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that the hyperparameters are setup, let us prepare the handshake between our data channels and the algorithm. To do this, we need to create the `sagemaker.session.s3_input` objects from our data channels. These objects are then put in a simple dictionary, which the algorithm consumes. Notice that here we use a `content_type` as `image/jpeg` for the image channels and the annoation channels. Notice how unlike the [RecordIO format](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb), we use four channels here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "train_data = sagemaker.session.s3_input(s3_train_data, distribution='FullyReplicated', \n",
    "                        content_type='image/jpeg', s3_data_type='S3Prefix')\n",
    "validation_data = sagemaker.session.s3_input(s3_validation_data, distribution='FullyReplicated', \n",
    "                             content_type='image/jpeg', s3_data_type='S3Prefix')\n",
    "train_annotation = sagemaker.session.s3_input(s3_train_annotation, distribution='FullyReplicated', \n",
    "                             content_type='image/jpeg', s3_data_type='S3Prefix')\n",
    "validation_annotation = sagemaker.session.s3_input(s3_validation_annotation, distribution='FullyReplicated', \n",
    "                             content_type='image/jpeg', s3_data_type='S3Prefix')\n",
    "\n",
    "data_channels = {'train': train_data, 'validation': validation_data, \n",
    "                 'train_annotation': train_annotation, 'validation_annotation':validation_annotation}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have our `Estimator` object, we have set the hyperparameters for this object and we have our data channels linked with the algorithm. The only remaining thing to do is to train the algorithm. The following cell will train the algorithm. Training the algorithm involves a few steps. Firstly, the instances that we requested while creating the `Estimator` classes are provisioned and are setup with the appropriate libraries. Then, the data from our channels are downloaded into the instance. Once this is done, the training job begins. The provisioning and data downloading will take time, depending on the size of the data. Therefore it might be a few minutes before we start getting data logs for our training jobs. The data logs will also print out Mean Average Precision (mAP) on the validation data, among other losses, for every run of the dataset once or one epoch. This metric is a proxy for the quality of the algorithm. \n",
    "\n",
    "Once the job has finished a \"Job complete\" message will be printed. The trained model can be found in the S3 bucket that was setup as `output_path` in the estimator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "od_model.fit(inputs=data_channels, logs=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hosting\n",
    "Once the training is done, we can deploy the trained model as an Amazon SageMaker real-time hosted endpoint. This will allow us to make predictions (or inference) from the model. Note that we don't have to host on the same insantance (or type of instance) that we used to train. Training is a prolonged and compute heavy job that require a different of compute and memory requirements that hosting typically do not. We can choose any type of instance we want to host the model. In our case we chose the `ml.p3.2xlarge` instance to train, but we choose to host the model on the less expensive cpu instance, `ml.m4.xlarge`. The endpoint deployment can be accomplished as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "object_detector = od_model.deploy(initial_instance_count = 1,\n",
    "                                 instance_type = 'ml.m4.xlarge')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Inference\n",
    "Now that the trained model is deployed at an endpoint that is up-and-running, we can use this endpoint for inference. To do this, let us download an image from [PEXELS](https://www.pexels.com/) which the algorithm has so-far not seen. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "!wget -O test.jpg https://images.pexels.com/photos/980382/pexels-photo-980382.jpeg\n",
    "file_name = 'test.jpg'\n",
    "\n",
    "with open(file_name, 'rb') as image:\n",
    "    f = image.read()\n",
    "    b = bytearray(f)\n",
    "    ne = open('n.txt','wb')\n",
    "    ne.write(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us use our endpoint to try to detect objects within this image. Since the image is `jpeg`, we use the appropriate `content_type` to run the prediction job. The endpoint returns a JSON file that we can simply load and peek into."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "object_detector.content_type = 'image/jpeg'\n",
    "results = object_detector.predict(b)\n",
    "detections = json.loads(results)\n",
    "print (detections)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results are in a format that is similar to the input .lst file (See [RecordIO Notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb) for more details on the .lst file definition. )with an addition of a confidence score for each detected object. The format of the output can be represented as `[class_index, confidence_score, xmin, ymin, xmax, ymax]`. Typically, we don't consider low-confidence predictions.\n",
    "\n",
    "We have provided additional script to easily visualize the detection outputs. You can visulize the high-confidence preditions with bounding box by filtering out low-confidence detections using the script below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def visualize_detection(img_file, dets, classes=[], thresh=0.6):\n",
    "        \"\"\"\n",
    "        visualize detections in one image\n",
    "        Parameters:\n",
    "        ----------\n",
    "        img : numpy.array\n",
    "            image, in bgr format\n",
    "        dets : numpy.array\n",
    "            ssd detections, numpy.array([[id, score, x1, y1, x2, y2]...])\n",
    "            each row is one object\n",
    "        classes : tuple or list of str\n",
    "            class names\n",
    "        thresh : float\n",
    "            score threshold\n",
    "        \"\"\"\n",
    "        import random\n",
    "        import matplotlib.pyplot as plt\n",
    "        import matplotlib.image as mpimg\n",
    "\n",
    "        img=mpimg.imread(img_file)\n",
    "        plt.imshow(img)\n",
    "        height = img.shape[0]\n",
    "        width = img.shape[1]\n",
    "        colors = dict()\n",
    "        for det in dets:\n",
    "            (klass, score, x0, y0, x1, y1) = det\n",
    "            if score < thresh:\n",
    "                continue\n",
    "            cls_id = int(klass)\n",
    "            if cls_id not in colors:\n",
    "                colors[cls_id] = (random.random(), random.random(), random.random())\n",
    "            xmin = int(x0 * width)\n",
    "            ymin = int(y0 * height)\n",
    "            xmax = int(x1 * width)\n",
    "            ymax = int(y1 * height)\n",
    "            rect = plt.Rectangle((xmin, ymin), xmax - xmin,\n",
    "                                 ymax - ymin, fill=False,\n",
    "                                 edgecolor=colors[cls_id],\n",
    "                                 linewidth=3.5)\n",
    "            plt.gca().add_patch(rect)\n",
    "            class_name = str(cls_id)\n",
    "            if classes and len(classes) > cls_id:\n",
    "                class_name = classes[cls_id]\n",
    "            plt.gca().text(xmin, ymin - 2,\n",
    "                            '{:s} {:.3f}'.format(class_name, score),\n",
    "                            bbox=dict(facecolor=colors[cls_id], alpha=0.5),\n",
    "                                    fontsize=12, color='white')\n",
    "        plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the sake of this notebook, we used a small portion of the COCO dataset for training and trained the model with only a few (30) epochs. This implies that the results might not be optimal. To achieve better detection results, you can try to use the more data from COCO dataset and train the model for more epochs. Tuning the hyperparameters, such as `mini_batch_size`, `learning_rate`, and `optimizer`, also helps to get a better detector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "object_categories = ['person', 'bicycle', 'car',  'motorbike', 'aeroplane', 'bus', 'train', 'truck', 'boat', \n",
    "                     'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog',\n",
    "                     'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag',\n",
    "                     'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat',\n",
    "                     'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',\n",
    "                     'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',\n",
    "                     'hot dog', 'pizza', 'donut', 'cake', 'chair', 'sofa', 'pottedplant', 'bed', 'diningtable',\n",
    "                     'toilet', 'tvmonitor', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven',\n",
    "                     'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier',\n",
    "                     'toothbrush']\n",
    "# Setting a threshold 0.20 will only plot detection results that have a confidence score greater than 0.20.\n",
    "threshold = 0.20\n",
    "\n",
    "# Visualize the detections.\n",
    "visualize_detection(file_name, detections['prediction'], object_categories, threshold)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Delete the Endpoint\n",
    "Having an endpoint running will incur some costs. Therefore as a clean-up job, we should delete the endpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "sagemaker.Session().delete_endpoint(object_detector.endpoint)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_mxnet_p36",
   "language": "python",
   "name": "conda_mxnet_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  },
  "notice": "Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.  Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
